Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] master from torvalds:master #292

Merged
merged 4 commits into from
Dec 26, 2020
Merged

[pull] master from torvalds:master #292

merged 4 commits into from
Dec 26, 2020

Conversation

pull[bot]
Copy link

@pull pull bot commented Dec 26, 2020

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

robherring and others added 4 commits December 25, 2020 21:58
Commit b9ac0f9 ("PCI: dwc: Move dw_pcie_setup_rc() to DWC common
code") broke enumeration of downstream devices on Tegra:

In non-working case (next-20201211):

  0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1)
  0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13)
  0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)

In working case (v5.10-rc7):

  0001:00:00.0 PCI bridge: Molex Incorporated Device 1ad2 (rev a1)
  0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13)
  0005:00:00.0 PCI bridge: Molex Incorporated Device 1ad0 (rev a1)
  0005:01:00.0 PCI bridge: PLX Technology, Inc. Device 3380 (rev ab)
  0005:02:02.0 PCI bridge: PLX Technology, Inc. Device 3380 (rev ab)
  0005:03:00.0 USB controller: PLX Technology, Inc. Device 3380 (rev ab)

The problem seems to be dw_pcie_setup_rc() is now called twice before and
after the link up handling. The fix is to move Tegra's link up handling to
.start_link() function like other DWC drivers. Tegra is a bit more
complicated than others as it re-inits the whole DWC controller to retry
the link. With this, the initialization ordering is restored to match the
prior sequence.

Fixes: b9ac0f9 ("PCI: dwc: Move dw_pcie_setup_rc() to DWC common code")
Link: https://lore.kernel.org/r/20201218143905.1614098-1-robh@kernel.org
Reported-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Tested-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Jonathan Hunter <jonathanh@nvidia.com>
Cc: Vidya Sagar <vidyas@nvidia.com>
Commit 660c486 ("PCI: dwc: Set 32-bit DMA mask for MSI target address
allocation") added dma_mask_set() call to explicitly set 32-bit DMA mask
for MSI message mapping, but for now it throws a warning on ret == 0, while
dma_set_mask() returns 0 in case of success.

Fix this by inverting the condition.

[bhelgaas: join string to make it greppable]
Fixes: 660c486 ("PCI: dwc: Set 32-bit DMA mask for MSI target address allocation")
Link: https://lore.kernel.org/r/20201222150708.67983-1-alobakin@pm.me
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Clang errors:

  drivers/mfd/ab8500-debugfs.c:1526:2: error: non-void function does not return a value [-Werror,-Wreturn-type]
          }
          ^
  drivers/mfd/ab8500-debugfs.c:1528:2: error: expected identifier or '('
  return 0;
          ^
  drivers/mfd/ab8500-debugfs.c:1529:1: error: extraneous closing brace ('}')
  }
  ^
  3 errors generated.

The cleanup in ab8500_interrupts_show left a curly brace around, remove
it to fix the error.

Fixes: 886c812 ("mfd: ab8500-debugfs: Remove the racy fiddling with irq_desc")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
…ernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

 - Fix a tegra enumeration regression (Rob Herring)

 - Fix a designware-host check that warned on *success*, not failure
   (Alexander Lobakin)

* tag 'pci-v5.11-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: dwc: Fix inverted condition of DMA mask setup warning
  PCI: tegra: Fix host link initialization
@pull pull bot added the ⤵️ pull label Dec 26, 2020
@pull pull bot merged commit 40f7823 into vchong:master Dec 26, 2020
pull bot pushed a commit that referenced this pull request Dec 9, 2021
…_lock

Taking sb_writers whilst holding mmap_lock isn't allowed and will result in
a lockdep warning like that below.  The problem comes from cachefiles
needing to take the sb_writers lock in order to do a write to the cache,
but being asked to do this by netfslib called from readpage, readahead or
write_begin[1].

Fix this by always offloading the write to the cache off to a worker
thread.  The main thread doesn't need to wait for it, so deadlock can be
avoided.

This can be tested by running the quick xfstests on something like afs or
ceph with lockdep enabled.

WARNING: possible circular locking dependency detected
5.15.0-rc1-build2+ #292 Not tainted
------------------------------------------------------
holetest/65517 is trying to acquire lock:
ffff88810c81d730 (mapping.invalidate_lock#3){.+.+}-{3:3}, at: filemap_fault+0x276/0x7a5

but task is already holding lock:
ffff8881595b53e8 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x28d/0x59c

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&mm->mmap_lock#2){++++}-{3:3}:
       validate_chain+0x3c4/0x4a8
       __lock_acquire+0x89d/0x949
       lock_acquire+0x2dc/0x34b
       __might_fault+0x87/0xb1
       strncpy_from_user+0x25/0x18c
       removexattr+0x7c/0xe5
       __do_sys_fremovexattr+0x73/0x96
       do_syscall_64+0x67/0x7a
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #1 (sb_writers#10){.+.+}-{0:0}:
       validate_chain+0x3c4/0x4a8
       __lock_acquire+0x89d/0x949
       lock_acquire+0x2dc/0x34b
       cachefiles_write+0x2b3/0x4bb
       netfs_rreq_do_write_to_cache+0x3b5/0x432
       netfs_readpage+0x2de/0x39d
       filemap_read_page+0x51/0x94
       filemap_get_pages+0x26f/0x413
       filemap_read+0x182/0x427
       new_sync_read+0xf0/0x161
       vfs_read+0x118/0x16e
       ksys_read+0xb8/0x12e
       do_syscall_64+0x67/0x7a
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #0 (mapping.invalidate_lock#3){.+.+}-{3:3}:
       check_noncircular+0xe4/0x129
       check_prev_add+0x16b/0x3a4
       validate_chain+0x3c4/0x4a8
       __lock_acquire+0x89d/0x949
       lock_acquire+0x2dc/0x34b
       down_read+0x40/0x4a
       filemap_fault+0x276/0x7a5
       __do_fault+0x96/0xbf
       do_fault+0x262/0x35a
       __handle_mm_fault+0x171/0x1b5
       handle_mm_fault+0x12a/0x233
       do_user_addr_fault+0x3d2/0x59c
       exc_page_fault+0x85/0xa5
       asm_exc_page_fault+0x1e/0x30

other info that might help us debug this:

Chain exists of:
  mapping.invalidate_lock#3 --> sb_writers#10 --> &mm->mmap_lock#2

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&mm->mmap_lock#2);
                               lock(sb_writers#10);
                               lock(&mm->mmap_lock#2);
  lock(mapping.invalidate_lock#3);

 *** DEADLOCK ***

1 lock held by holetest/65517:
 #0: ffff8881595b53e8 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x28d/0x59c

stack backtrace:
CPU: 0 PID: 65517 Comm: holetest Not tainted 5.15.0-rc1-build2+ #292
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
Call Trace:
 dump_stack_lvl+0x45/0x59
 check_noncircular+0xe4/0x129
 ? print_circular_bug+0x207/0x207
 ? validate_chain+0x461/0x4a8
 ? add_chain_block+0x88/0xd9
 ? hlist_add_head_rcu+0x49/0x53
 check_prev_add+0x16b/0x3a4
 validate_chain+0x3c4/0x4a8
 ? check_prev_add+0x3a4/0x3a4
 ? mark_lock+0xa5/0x1c6
 __lock_acquire+0x89d/0x949
 lock_acquire+0x2dc/0x34b
 ? filemap_fault+0x276/0x7a5
 ? rcu_read_unlock+0x59/0x59
 ? add_to_page_cache_lru+0x13c/0x13c
 ? lock_is_held_type+0x7b/0xd3
 down_read+0x40/0x4a
 ? filemap_fault+0x276/0x7a5
 filemap_fault+0x276/0x7a5
 ? pagecache_get_page+0x2dd/0x2dd
 ? __lock_acquire+0x8bc/0x949
 ? pte_offset_kernel.isra.0+0x6d/0xc3
 __do_fault+0x96/0xbf
 ? do_fault+0x124/0x35a
 do_fault+0x262/0x35a
 ? handle_pte_fault+0x1c1/0x20d
 __handle_mm_fault+0x171/0x1b5
 ? handle_pte_fault+0x20d/0x20d
 ? __lock_release+0x151/0x254
 ? mark_held_locks+0x1f/0x78
 ? rcu_read_unlock+0x3a/0x59
 handle_mm_fault+0x12a/0x233
 do_user_addr_fault+0x3d2/0x59c
 ? pgtable_bad+0x70/0x70
 ? rcu_read_lock_bh_held+0xab/0xab
 exc_page_fault+0x85/0xa5
 ? asm_exc_page_fault+0x8/0x30
 asm_exc_page_fault+0x1e/0x30
RIP: 0033:0x40192f
Code: ff 48 89 c3 48 8b 05 50 28 00 00 48 85 ed 7e 23 31 d2 4b 8d 0c 2f eb 0a 0f 1f 00 48 8b 05 39 28 00 00 48 0f af c2 48 83 c2 01 <48> 89 1c 01 48 39 d5 7f e8 8b 0d f2 27 00 00 31 c0 85 c9 74 0e 8b
RSP: 002b:00007f9931867eb0 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 00007f9931868700 RCX: 00007f993206ac00
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00007ffc13e06ee0
RBP: 0000000000000100 R08: 0000000000000000 R09: 00007f9931868700
R10: 00007f99318689d0 R11: 0000000000000202 R12: 00007ffc13e06ee0
R13: 0000000000000c00 R14: 00007ffc13e06e00 R15: 00007f993206a000

Fixes: 726218f ("netfs: Define an interface to talk to a cache")
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Jeff Layton <jlayton@kernel.org>
cc: Jan Kara <jack@suse.cz>
cc: linux-cachefs@redhat.com
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20210922110420.GA21576@quack2.suse.cz/ [1]
Link: https://lore.kernel.org/r/163887597541.1596626.2668163316598972956.stgit@warthog.procyon.org.uk/ # v1
pull bot pushed a commit that referenced this pull request Nov 21, 2024
This commit provides a watchdog timer that sets a limit of how long a
single sub-test could run:
- if sub-test runs for 10 seconds, the name of the test is printed
  (currently the name of the test is printed only after it finishes);
- if sub-test runs for 120 seconds, the running thread is terminated
  with SIGSEGV (to trigger crash_handler() and get a stack trace).

Specifically:
- the timer is armed on each call to run_one_test();
- re-armed at each call to test__start_subtest();
- is stopped when exiting run_one_test().

Default timeout could be overridden using '-w' or '--watchdog-timeout'
options. Value 0 can be used to turn the timer off.
Here is an example execution:

    $ ./ssh-exec.sh ./test_progs -w 5 -t \
      send_signal/send_signal_perf_thread_remote,send_signal/send_signal_nmi_thread_remote
    WATCHDOG: test case send_signal/send_signal_nmi_thread_remote executes for 5 seconds, terminating with SIGSEGV
    Caught signal #11!
    Stack trace:
    ./test_progs(crash_handler+0x1f)[0x9049ef]
    /lib64/libc.so.6(+0x40d00)[0x7f1f1184fd00]
    /lib64/libc.so.6(read+0x4a)[0x7f1f1191cc4a]
    ./test_progs[0x720dd3]
    ./test_progs[0x71ef7a]
    ./test_progs(test_send_signal+0x1db)[0x71edeb]
    ./test_progs[0x9066c5]
    ./test_progs(main+0x5ed)[0x9054ad]
    /lib64/libc.so.6(+0x2a088)[0x7f1f11839088]
    /lib64/libc.so.6(__libc_start_main+0x8b)[0x7f1f1183914b]
    ./test_progs(_start+0x25)[0x527385]
    #292     send_signal:FAIL
    test_send_signal_common:PASS:reading pipe 0 nsec
    test_send_signal_common:PASS:reading pipe error: size 0 0 nsec
    test_send_signal_common:PASS:incorrect result 0 nsec
    test_send_signal_common:PASS:pipe_write 0 nsec
    test_send_signal_common:PASS:setpriority 0 nsec

Timer is implemented using timer_{create,start} librt API.
Internally librt uses pthreads for SIGEV_THREAD timers,
so this change adds a background timer thread to the test process.
Because of this a few checks in tests 'bpf_iter' and 'iters'
need an update to account for an extra thread.

For parallelized scenario the watchdog is also created for each worker
fork. If one of the workers gets stuck, it would be terminated by a
watchdog. In theory, this might lead to a scenario when all worker
threads are exhausted, however this should not be a problem for
server_main(), as it would exit with some of the tests not run.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20241112110906.3045278-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants